

Practical Methods for Graph Two-Sample Testing

Ghoshdastidar, Debarghya, Luxburg, Ulrike von

Neural Information Processing Systems

Hypothesis testing for graphs has been an important tool in applied research fields for more than two decades, and still remains a challenging problem as one often needs to draw inference from few replicates of large graphs. Recent studies in statistics and learning theory have provided some theoretical insights about such high-dimensional graph testing problems, but the practicality of the developed theoretical methods remains an open question. In this paper, we consider the problem of two-sample testing of large graphs. We demonstrate the practical merits and limitations of existing theoretical tests and their bootstrapped variants. We also propose two new tests based on asymptotic distributions. We show that these tests are computationally less expensive and, in some cases, more reliable than the existing methods.
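The abstract does not spell out the test statistics, so as a generic baseline (not the paper's asymptotic tests) here is a minimal sketch of a permutation-style two-sample test on a scalar network statistic, which is the shape the bootstrapped variants take; the default edge-density statistic is an illustrative choice:

```python
import numpy as np

def two_sample_graph_test(sample_a, sample_b, stat=None, n_boot=2000, seed=0):
    """Permutation two-sample test on a scalar graph statistic.

    sample_a, sample_b: lists of adjacency matrices (numpy arrays).
    stat: function mapping an adjacency matrix to a scalar
          (defaults to edge density).
    Returns an approximate p-value for H0: both samples share a distribution.
    """
    rng = np.random.default_rng(seed)
    if stat is None:
        # Edge density: each undirected edge is counted twice over n(n-1) pairs.
        stat = lambda A: A.sum() / (A.shape[0] * (A.shape[0] - 1))
    xs = np.array([stat(A) for A in sample_a])
    ys = np.array([stat(A) for A in sample_b])
    observed = abs(xs.mean() - ys.mean())
    pooled = np.concatenate([xs, ys])
    n = len(xs)
    count = 0
    for _ in range(n_boot):
        rng.shuffle(pooled)  # random relabeling of the pooled statistics
        if abs(pooled[:n].mean() - pooled[n:].mean()) >= observed:
            count += 1
    return (count + 1) / (n_boot + 1)
```

Such resampling tests are simple and distribution-free, but as the abstract notes they can be expensive for few replicates of large graphs, which is what motivates tests with known asymptotic distributions.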


Reviews: Rényi Divergence Variational Inference

Neural Information Processing Systems

This is a very good and technically sound paper, containing a significant amount of material. The theoretical investigation of the properties of alpha-divergence minimization is thorough, clear and detailed. The paper provides significant theoretical insight and understanding into alpha-divergence minimization and optimization-based approximate inference in general. My biggest concern about the alpha-divergence framework is whether its theoretical richness and elegance actually translates to practical methods. In other words, I'm not sure that the practical aspects of it are appealing enough to convince practitioners of variational inference to switch to alpha-divergence minimization instead.
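For context (not part of the review itself), the divergence family the paper builds on is the Rényi divergence, defined for $\alpha > 0$, $\alpha \neq 1$ by

```latex
D_{\alpha}(q \,\|\, p) \;=\; \frac{1}{\alpha - 1} \log \int q(\theta)^{\alpha}\, p(\theta)^{1-\alpha}\, d\theta,
```

which recovers the Kullback-Leibler divergence $\mathrm{KL}(q \,\|\, p)$ in the limit $\alpha \to 1$, so standard variational inference sits inside the family as a special case.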


Reviews: Practical Methods for Graph Two-Sample Testing

Neural Information Processing Systems

This paper studies the problem of two-sample testing of large graphs under the inhomogeneous Erdős-Rényi model. This model is quite generic: it assumes that an undirected edge (i, j) is present in the graph with probability P_{ij}, independently of all other edges. In the most general case the parameter matrix P can be any symmetric matrix (with zero diagonal), but common special cases such as the stochastic block model and the mixed-membership stochastic block model both result in a low-rank P. Suppose there are two random-graph distributions, parameterized by matrices P and Q; the goal is to test whether P = Q or not (the null hypothesis being that they are equal). They assume that the graphs are vertex-aligned, which helps as it removes the need to search over permutations to align the graphs.
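The model described above is straightforward to simulate; a minimal sketch, with the stochastic block model shown as the low-rank special case the review mentions:

```python
import numpy as np

def sample_ier(P, rng=None):
    """Sample an undirected graph from the inhomogeneous Erdos-Renyi model:
    edge (i, j) is present independently with probability P[i, j].
    P must be symmetric with zero diagonal; returns a 0/1 adjacency matrix."""
    rng = np.random.default_rng() if rng is None else rng
    n = P.shape[0]
    U = rng.random((n, n))
    A = np.triu((U < P).astype(int), 1)  # decide each edge once, above the diagonal
    return A + A.T

def sbm_matrix(sizes, B):
    """Build P for a stochastic block model with community sizes `sizes`
    and block-probability matrix B; the resulting P is low rank."""
    labels = np.repeat(np.arange(len(sizes)), sizes)
    P = B[np.ix_(labels, labels)]
    np.fill_diagonal(P, 0.0)
    return P
```

For instance, `sample_ier(sbm_matrix([50, 50], np.array([[0.5, 0.1], [0.1, 0.5]])))` draws one graph from a planted two-community model.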


A practical method for occupational skills detection in Vietnamese job listings

Tran, Viet-Trung, Cao, Hai-Nam, Cao, Tuan-Dung

arXiv.org Artificial Intelligence

The Vietnamese labor market has been developing in an imbalanced way. The number of university graduates is growing, but so is the unemployment rate. This situation is often caused by the lack of accurate and timely labor market information, which leads to skill mismatches between worker supply and actual market demand. To build a data monitoring and analytics platform for the labor market, one of the main challenges is to automatically detect occupational skills from labor-related data, such as resumes and job listings. Traditional approaches rely on an existing taxonomy and/or large annotated datasets to build Named Entity Recognition (NER) models; they are expensive and require substantial manual effort. In this paper, we propose a practical methodology for skill detection in Vietnamese job listings. Rather than viewing the task as a NER task, we treat it as a ranking problem. We propose a pipeline in which phrases are first extracted and ranked by semantic similarity to their contexts; a final classifier then detects skill phrases. We collected three datasets and conducted extensive experiments. The results demonstrate that our methodology outperforms a NER model when annotated data is scarce.
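The extract-rank-classify pipeline described in the abstract can be sketched as follows. This is only an illustration of the shape of the approach, not the paper's implementation: the toy hashed-trigram `embed` stands in for a real pretrained sentence encoder, and the n-gram extractor stands in for a proper phrase chunker:

```python
import re
import numpy as np

def embed(text, dim=64):
    """Toy deterministic embedding (hashed character trigrams); a real system
    would use a pretrained sentence encoder here."""
    v = np.zeros(dim)
    for i in range(len(text) - 2):
        v[hash(text[i:i + 3]) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def candidate_phrases(text, max_len=3):
    """All 1..max_len-word spans of tokens (stand-in for a phrase extractor)."""
    tokens = re.findall(r"[A-Za-z+#.]+", text)
    return {" ".join(tokens[i:i + k])
            for i in range(len(tokens))
            for k in range(1, max_len + 1) if i + k <= len(tokens)}

def rank_candidates(candidates, context):
    """Rank candidate phrases by cosine similarity to their context."""
    c = embed(context)
    scored = [(p, float(embed(p) @ c)) for p in candidates]
    return sorted(scored, key=lambda x: -x[1])

def detect_skills(listing, is_skill, top_k=20):
    """Pipeline: extract phrases, rank them against the listing,
    then apply a final classifier `is_skill` to the top candidates."""
    ranked = rank_candidates(candidate_phrases(listing), listing)
    return [p for p, _ in ranked[:top_k] if is_skill(p)]
```

In the paper's setting the final classifier is trained; here `is_skill` is left as a pluggable callback.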


Python Dictionary: 10 Practical Methods You Need to Know

#artificialintelligence

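The excerpt does not show any of the ten methods, but for flavor, a few dictionary operations that come up constantly in practice (all standard Python built-ins):

```python
# A small counter dictionary to demonstrate on.
counts = {"spam": 3, "eggs": 1}

counts.get("ham", 0)             # lookup with a default, no KeyError -> 0
counts.setdefault("ham", 0)      # inserts "ham": 0 if absent, returns the value
counts.update({"eggs": 2})       # merge/overwrite keys in place
merged = {**counts, "toast": 5}  # merged copy via unpacking
# Rebuild the dict sorted by value (ascending).
by_value = dict(sorted(counts.items(), key=lambda kv: kv[1]))
```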


A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups

Finzi, Marc, Welling, Max, Wilson, Andrew Gordon

arXiv.org Machine Learning

Symmetries and equivariance are fundamental to the generalization of neural networks on domains such as images, graphs, and point clouds. Existing work has primarily focused on a small number of groups, such as the translation, rotation, and permutation groups. In this work we provide a completely general algorithm for solving for the equivariant layers of matrix groups. In addition to recovering solutions from other works as special cases, we construct multilayer perceptrons equivariant to multiple groups that have never been tackled before, including $\mathrm{O}(1,3)$, $\mathrm{O}(5)$, $\mathrm{Sp}(n)$, and the Rubik's cube group. Our approach outperforms non-equivariant baselines, with applications to particle physics and dynamical systems. We release our software library to enable researchers to construct equivariant layers for arbitrary matrix groups.
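The core idea of solving for equivariant layers can be sketched in a few lines of numpy. This is not the paper's full algorithm (which also handles Lie-algebra constraints and differing input/output representations); it is a minimal sketch for finite generators with the same representation on input and output, where an equivariant weight matrix W must commute with every generator rho(g):

```python
import numpy as np

def equivariant_basis(generators, tol=1e-10):
    """Basis of linear maps W with rho(g) W = W rho(g) for every generator.

    Uses vec(A X B) = (B^T kron A) vec(X) (column-major vec) to write the
    commutation constraints as a linear system, then reads the equivariant
    subspace off the SVD nullspace."""
    n = generators[0].shape[0]
    I = np.eye(n)
    # Each generator contributes the constraint (I kron g - g^T kron I) vec(W) = 0.
    C = np.vstack([np.kron(I, g) - np.kron(g.T, I) for g in generators])
    _, s, Vt = np.linalg.svd(C)
    rank = int((s > tol).sum())
    # Rows of Vt beyond the rank span the nullspace, i.e. the equivariant maps.
    return [v.reshape(n, n, order="F") for v in Vt[rank:]]
```

For example, for the cyclic shift on 4 elements the recovered basis spans the circulant matrices, a 4-dimensional space; the same machinery applies to any finite set of generator matrices.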


Introducing GIG: A Practical Method for Explaining Diverse Ensemble Machine Learning Models

#artificialintelligence

Machine learning is proven to yield better underwriting results and mitigate bias in lending. But not all machine learning techniques, including the wide swath at work in unregulated uses, are built to be transparent. Many of the algorithms that get deployed generate results that are difficult to explain. Recently, researchers have proposed novel and powerful methods for explaining machine learning models, notably Shapley Additive Explanations (SHAP) and Integrated Gradients (IG). These methods provide mechanisms for assigning credit to the data variables a model uses to generate a score.
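To make the credit-assignment idea concrete, here is a minimal sketch of plain Integrated Gradients (the IG mentioned above, not the GIG extension), using a toy quadratic model with an analytic gradient in place of a real scorer:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=256):
    """Integrated Gradients: attribute f(x) - f(baseline) to features by
    integrating the gradient along the straight path from baseline to x."""
    alphas = (np.arange(steps) + 0.5) / steps           # midpoint rule
    path = baseline + alphas[:, None] * (x - baseline)  # (steps, d) points
    grads = np.stack([grad_f(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

# Toy model: f(x) = (w.x)^2, with analytic gradient 2 (w.x) w.
w = np.array([1.0, -2.0, 0.5])
f = lambda x: (w @ x) ** 2
grad_f = lambda x: 2 * (w @ x) * w

x = np.array([1.0, 0.0, 2.0])
baseline = np.zeros(3)
attr = integrated_gradients(grad_f, x, baseline)
# Completeness axiom: the attributions sum to f(x) - f(baseline).
```

The completeness property (attributions summing to the score difference) is exactly the "assigning credit" guarantee the excerpt refers to; GIG generalizes this path-integral idea to ensembles mixing trees and neural networks.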





Data Science with Java: Practical Methods for Scientists and Engineers, Michael R. Brzustowicz, PhD, eBook - Amazon.com

@machinelearnbot

This book is for scientists and engineers already familiar with the concepts of application development who want to jump headfirst into data science. The topics covered here will walk you through the data science pipeline, explaining mathematical theory and giving code examples along the way. This book is the perfect jumping-off point into much deeper waters. I wrote this book to start a movement. As data science skyrockets to stardom, fueled by R and Python, very few practitioners venture into the world of Java.